ABSTRACT

We reproduce the galaxy clustering catalogue from the SDSS-III Baryon Oscillation
Spectroscopic Survey Final Data Release (BOSS DR11&DR12) with high fidelity on
all relevant scales in order to allow a robust analysis of baryon acoustic oscillations and
redshift space distortions. We have generated (6,000) 12,288 MultiDark PATCHY BOSS
(DR11) DR12 light-cones corresponding to an effective volume of ~192,000 [$h^{-1}$ Gpc]$^3$
(the largest ever simulated volume), including cosmic evolution in the redshift range
from 0.15 to 0.75. The mocks have been calibrated using a reference galaxy catalogue
based on the halo abundance matching modelling of the BOSS DR11&DR12 galaxy
clustering data and on the data themselves. The production follows three steps. First,
we apply the PATCHY code to generate a dark matter field and an object distribution
including nonlinear stochastic galaxy bias. Secondly, we run the halo/stellar distribution
reconstruction HADRON code to assign masses to the various objects. This step
uses the mass distribution as a function of local density and non-local indicators (i.e.,
tidal field tensor eigenvalues and relative halo exclusion separation for massive objects)
from the reference simulation applied to the corresponding PATCHY dark matter and
galaxy distribution. Finally, we apply the SUGAR code to build the light cones. The
resulting MultiDark PATCHY mock light cones reproduce the number density, selection
function, survey geometry, and in general within 1-$\sigma$, for arbitrary stellar mass bins,
the power spectrum up to $k = 0.3\,h\,{\rm Mpc}^{-1}$, the two-point correlation functions down
to a few Mpc scales, and the three-point statistics of the BOSS DR11&DR12 galaxy
samples.

Key words: cosmology: methods: numerical - galaxies: haloes - galaxies: statistics - 
large-scale structure of Universe 


1 INTRODUCTION 

The observable Universe represents a unique realization of
an underlying physical cosmological process. Large galaxy
redshift surveys like the Baryon Oscillation Spectroscopic
Survey (BOSS; e.g., Bolton et al. 2012; Dawson et al. 2013;
Alam et al. 2015), a branch of the ongoing Sloan Digital
Sky Survey (SDSS-III; Eisenstein et al. 2011), scan the sky
with unprecedented accuracy trying to unveil structure formation
in an expanding Universe. One important question
arises in the analysis of the data provided by such surveys:
if the Universe is comparable to a huge unique experiment,
how can we determine the uncertainties in the measurement
of quantities derived from observing it? One strategy consists
of dividing the observations into subvolumes, treating
each of the subsamples as independent measurements, and
computing the errors with jackknife or bootstrap estimates.
While this approach remains relevant as a way to
obtain error estimates directly from the data (see e.g. Norberg
et al. 2009), it also has several disadvantages. First,
it does not include systematic errors present in all subvolumes;
secondly, it does not lead to a physical understanding
of the data by itself; and thirdly, it introduces variance beyond
the one already present in the data on scales larger
than the subvolumes. The last point is especially critical
when the signal sought has a large characteristic scale and
its detection significance crucially depends on the volume,
as is the case for baryon acoustic oscillations (BAOs; see
e.g. Seo & Eisenstein 2005; White et al. 2009). During the
past decades, there has been a huge effort to encode our
physical knowledge of structure formation in computational
algorithms, and to compare the theoretical models to the actual
observations. Pioneering works started with qualitative
comparisons (see e.g. Klypin & Shandarin 1983; Blumenthal
et al. 1984; Davis et al. 1985). Since then simulations
have grown and such comparisons have turned increasingly
quantitative (see e.g. Klypin et al. 2003; Springel et al.
2005; Boylan-Kolchin et al. 2009; Klypin et al. 2011). These
efforts are essential to understand structure formation and
yet they suffer from a strong limitation: as simulations always
push the computational limits they are not suited for
massive production. In fact the number of current state-of-the-art
large volume N-body simulations is of order 10 (Kim
et al. 2009; Angulo et al. 2012; Prada et al. 2012; Alimi et al.
2012; Watson et al. 2014; Fosalba et al. 2015; Skillman et al.
2014; Klypin et al. 2014; Ishiyama et al. 2015). However,
an ideal approach to determine the uncertainties from current
and upcoming surveys scanning large sky areas, and
hence covering huge volumes, such as BOSS^1 (White et al.
2011), DESI^2/BigBOSS (Schlegel et al. 2011), DES^3 (Frieman
& Dark Energy Survey Collaboration 2013), LSST^4
(LSST Dark Energy Science Collaboration 2012), J-PAS^5
(Benitez et al. 2014), 4MOST^6 (de Jong et al. 2012), or
Euclid^7 (Cimatti et al. 2009; Laureijs 2009), requires thousands
of such simulations if the simplest error determination
methods are used (Dodelson & Schneider 2013; Taylor et al.
2013; Percival et al. 2014). Alternative, more efficient methods
need to be considered to face this challenge. A few pioneering
works explored a viable strategy more than a decade
ago relying on simplified fast gravity solvers using perturbation
theory: PINOCCHIO (Monaco et al. 2002, 2013) and
PTHALOS (Scoccimarro & Sheth 2002). Nevertheless, these
methods are not trivial, need calibration with N-body simulations,
and still demand high computational efforts. For
this reason, some of the first analyses of large surveys (Percival
et al. 2001; Cole et al. 2005) were based on lognormal
realizations (see also Percival et al. 2004; Beutler et al.
2011), which match the two-point statistics by construction

^1 http://www.sdss3.org/surveys/boss.php
^2 http://desi.lbl.gov/
^3 http://www.darkenergysurvey.org
^4 http://www.lsst.org/lsst/
^5 http://j-pas.org/
^6 http://www.aip.de/en/research/research-area-ea/research-groups-and-projects/4most
^7 http://www.euclid-ec.org


(Coles & Jones 1991), although their three-point statistics are
very different from the true ones (see e.g. White et al. 2014;
Chuang et al. 2015b). It is also not clear that their four-point
statistics will be accurate (Cooray & Hu 2001; Takada
& Hu 2013).

The analysis of past data releases of the BOSS collaboration
utilized 1,000 mocks, created with an improved
version of PTHALOS (Manera et al. 2013, 2015). The use
of approximate gravity solvers in these methods came at
the expense of only matching clustering statistics on a wide
range of scales to ~10% precision (and strongly deviating
towards small scales $\lesssim 20\,h^{-1}$ Mpc).

This sets the agenda for the current BOSS data release
DR11&DR12 and the requirements for a new generation of
mock galaxy catalogues. Ideally one would like to base these
on efficient solvers that are trained on exact solutions and
deliver a comparable precision. A new generation of methods
that can meet these high requirements has been developed
during the past two years, in particular, PATCHY (Kitaura
et al. 2014), QPM (White et al. 2014), and EZMOCKS (Chuang
et al. 2015a). The key concept exploited by these methods
is to rely only on the large-scale density field obtained from
approximate gravity solvers and use biasing prescriptions
to populate it with mock galaxies, in a similar way to the
methods proposed to augment the resolution of N-body simulations
(de la Torre & Peacock 2013; de la Torre et al. 2013;
Angulo et al. 2014; Ahn et al. 2015). One should however
be careful, as computing an accurate dark matter field is
a necessary, but not sufficient condition to reproduce the
correct halo/galaxy three-point statistics. The bias parameters
are degenerate in the two-point statistics and need to
be additionally constrained to reproduce higher order statistics
(Kitaura et al. 2015). We will rely in this work on the
PATCHY method due to its verified accuracy in the two- and
three-point statistics for different populations of objects (see
application of the HADRON code to PATCHY and EZMOCKS;
Zhao et al. 2015). An additional set of galaxy mocks fitting
the BOSS DR11&DR12 (CMASS and LOWZ) data at
two mean redshifts (respectively) based on QPM has been
produced in an unprecedented effort. These are constructed
with a different structure formation model based on low resolution
particle mesh solvers, and a different galaxy bias,
based on a rank-ordering scheme assigning the most massive
objects to the highest density peaks (for a comparison of
both sets of catalogues see §3 and Gil-Marín et al. 2015a).

Another approach uses approximate PT based solutions
to speed up N-body solvers (see COLA method, Tassev et al.
2013; Howlett et al. 2015; Koda et al. 2015). This method
is very promising to generate ensembles of reference mock
catalogues; however, it has the drawback of requiring large
computational memory for the force calculation and a large
number of particles to resolve the haloes (see Chuang et al.
2015b), and is therefore not suitable for the massive production
aimed at in this work. The speed-up of the method over
N-body simulations comes at the expense of not resolving
the sub-structures required to produce a realistic galaxy catalogue.
This problem can be circumvented by, e.g., augmenting
the missing objects with the halo occupation distribution
model, thereby losing some of the advantage of having
a more precise description of the nonlinear clustering over
the above mentioned methods which rely only on the large
scale dark matter field, as shown in a comparison study (see
Chuang et al. 2015b, and references therein). One may need
an approach like COLA, to model the large-scale structure,
combined with the galaxy bias presented in this work for
future emission line galaxy-based surveys. We will, however,
demonstrate here that this is not necessary to model the
distribution of luminous red galaxies (LRGs) targeted in this
work.

One could argue whether mock catalogues are required 
at all, as analytical models may deliver an almost direct 
computation of error bars and covariance matrices (Hartlap 
et al. 2007; Hamaus et al. 2010; Dodelson & Schneider 2013; 
Taylor et al. 2013; Kalus et al. 2016). It still remains to be 
shown that these methods making simple assumptions, such 
as that the density field is Gaussian distributed, yield the 
same accuracy as covariance matrices based on large sets of 
mock catalogues. 

Nevertheless, the purpose of mock catalogues is manifold,
as they not only serve to provide error estimates, but
also to provide an understanding of the systematics of the
survey and of the methodology. Any analytical prediction
or data analysis method should be cross-checked with large
ensembles of mock galaxy catalogues, for which the products
of this work could be useful. One clear example is the case
of BAO reconstruction techniques (see e.g. Eisenstein et al.
2007; Padmanabhan et al. 2012; Anderson et al. 2014; Ross
et al. 2015).

We exploit the efficiency and accuracy of the PATCHY
code to produce 12,288 galaxy mock catalogues^8 including
the lightcone evolution of galaxy bias based on the
halo abundance matching technique applied to the reference
BigMultiDark N-body simulation (see Rodríguez-Torres
et al. 2015, companion paper), and to the peculiar
motions based on the observational data, matching the two- and
three-point statistics, in real and redshift space, of the BOSS
DR11&DR12 galaxy clustering data at different redshifts
and for arbitrary stellar mass bins. Special care has been
taken to include all relevant observational effects including
selection functions and masking. The MultiDark PATCHY
BOSS DR11 mock catalogues presented in this work are
publicly available^9.

This paper is structured as follows: in §2 we describe
the methodology. This section starts with the generation
of the reference catalogue using N-body simulations and
the halo abundance matching technique. Subsequently the
scheme to massively generate mock catalogues is described.
Then we show in §3 the statistical comparison between the
mock catalogues and the BOSS DR12 data. Subsequently we
discuss future work (§4). Finally, in §5 we present the
conclusions. The reader interested only in the results may
skip §2 and go directly to §3.


^8 This corresponds to an effective volume of
~192,000 [$h^{-1}$ Gpc]$^3$, a factor of ~20 times larger than
the volume of the DEUS FUR simulation (Alimi et al. 2012), and
a factor of ~375 times larger than the DarkSky ds14 simulation
(Skillman et al. 2014).

^9 http://data.sdss3.org/sas/dr11/boss/lss/dr11_patchy_mocks/

^10 The BOSS DR12 mock catalogues will be made publicly
available together with the data catalogue:
http://data.sdss3.org/sas/dr12/boss/lss/dr12_patchy_mocks/.


2 METHODOLOGY 

To construct high-fidelity mock light cones for interpreting
the BOSS DR11&DR12 galaxy clustering, we adopt an iterative
training procedure in which a reference catalogue
is statistically reproduced with approximate gravity solvers
and analytical-statistical biasing models. The whole algorithm
involves several steps and is summarized in the flow
chart in Fig. 1.

(i) The first step consists of the generation of an accurate
reference catalogue. Here we rely on a large N-body simulation
capable of resolving distinct haloes and the corresponding
substructures. This permits us to apply the halo abundance
matching (HAM) technique to reproduce the clustering of the
observations with only one parameter: the scatter in the stellar
mass-to-halo mass relation (see Rodríguez-Torres et al. 2015,
companion paper; and §2.1). This technique is applied in different
redshift bins to obtain a detailed galaxy bias evolution spanning
the redshift range covered by BOSS DR11&DR12 galaxies.
In this way we obtain mock galaxy catalogues in full cubical
volumes of $2.5\,h^{-1}$ Gpc side at different redshifts.

(ii) In the second step we train the PATCHY code (Kitaura
et al. 2014, 2015) to match the two- and three-point clustering
of the full mock galaxy catalogues for each redshift bin.
Here we consider all the mock galaxies together in a single
bin irrespective of their stellar mass.

(iii) In the third step we apply the HADRON code (Zhao
et al. 2015) to assign stellar masses to the individual objects.

(iv) In the fourth step we apply the SUGAR code (see
Rodríguez-Torres et al. 2015, companion paper), which includes
selection effects and masking, and combines different
boxes at different redshifts into a light-cone.

(v) In the fifth step the resulting MultiDark PATCHY
mock catalogues are compared to the observations. The process
is iterated until the desired accuracy for different statistical
measures is reached.

In the next sections we describe in detail the steps
listed above for the massive generation of accurate mock
galaxy catalogues. The reader interested only in the results
may go directly to §3.


2.1 Reference mock catalogues 

The reference catalogues are extracted from one of the BigMultiDark
simulations^11 (Klypin et al. 2014), which was performed
using GADGET-2 (Springel et al. 2005) with 3,840$^3$
particles in a volume of ($2.5\,h^{-1}$ Gpc)$^3$ assuming a $\Lambda$ cold
dark matter Planck cosmology with {$\Omega_{\rm M} = 0.307115$, $\Omega_{\rm b} =
0.048206$, $\sigma_8 = 0.8288$, $n_s = 0.9611$}, and a Hubble constant
($H_0 \equiv 100\,h$ km s$^{-1}$ Mpc$^{-1}$) given by $h = 0.6777$. Haloes
were defined based on the Bound Density Maximum halo
finder (Klypin & Holtzman 1997).

We rely here on the HAM technique to connect haloes
to galaxies (Kravtsov et al. 2004; Neyrinck et al. 2004; Tasitsiomi
et al. 2004; Vale & Ostriker 2004; Conroy et al. 2006;
Kim et al. 2008; Guo et al. 2010; Wetzel & White 2010;
Trujillo-Gomez et al. 2011; Nuza et al. 2013).


^11 http://www.multidark.org/MultiDark/
MultiDark PATCHY Mocks for BOSS DR11&DR12: training mock catalogs from observed and simulated data sets



*: the asterisk indicates the steps in which calibration with observations and simulations is required

ALPT: Augmented Lagrangian Perturbation Theory: Kitaura & Heß (2013)

BigMultiDark: N-body Planck simulation, ($2.5\,h^{-1}$ Gpc)$^3$ with 3,840$^3$ particles: Klypin et al. (2014)

HADRON: Halo mAss Distribution ReconstructiON: Zhao et al. (2015)

HAM: Halo Abundance Matching, Rodríguez-Torres et al. (2015, companion paper)

MockFactory: White et al. (2014)

PATCHY: PerturbAtion Theory Catalog generator of Halo and galaxY distributions: Kitaura et al. (2014, 2015)

SUGAR: SUrvey GenerAtoR, Rodríguez-Torres et al. (2015, companion paper); this code contains HAM and MockFactory


Figure 1. Flowchart of the methodology applied in this work for the generation of high fidelity BOSS DR11&DR12 mock galaxy
catalogues: i) starting from a reference mock catalogue calibrated with the observations, ii) followed by the reproduction of the whole
catalogue, iii) with the subsequent mass assignment, iv) and survey generation. v) The final catalogues are compared with the observations
and the simulation, and the previous steps are repeated until the mock catalogues are compatible with the observations within 1-$\sigma$ for
the monopole and quadrupole up to $k \sim 0.3\,h\,{\rm Mpc}^{-1}$.


We note that there are alternative methods connecting
haloes to galaxies like the HOD model, which we are
not going to consider here (e.g., Berlind & Weinberg 2002;
Kravtsov et al. 2004; Zentner et al. 2005; Zehavi et al. 2005;
Zheng et al. 2007; Skibba & Sheth 2009; Ross & Brunner
2009; Zheng et al. 2009; White et al. 2011). These methods
are based on a statistical relation describing the probability
that a halo of virial mass M hosts N galaxies with some
specified properties. In general, theoretical HODs require
the fitting of a function with several parameters, which we
want to avoid here.

At first order HAM assumes a one-to-one correspondence
between the luminosity and stellar or dynamical
masses: galaxies with more stars are assigned to more massive
haloes or subhaloes. The luminosity in a red band is
sometimes used instead of stellar mass. There should be
some degree of stochasticity in the relation between stellar
and dynamical masses due to deviations in the merger
history, angular momentum, halo concentration, and even
observational errors (Tasitsiomi et al. 2004; Behroozi et al.
2010; Leauthaud et al. 2011; Trujillo-Gomez et al. 2011).
Therefore, we include a scatter in that relation, which is necessary
to accurately fit the clustering of the BOSS data (see
Rodríguez-Torres et al. 2015, companion paper). To do this,
we modify the maximum circular velocity ($V_{\rm max}$) of each
object by adding Gaussian noise: $V_{\rm max}^{\rm scat} = V_{\rm max}\,(1 + \mathcal{N}(0, \sigma))$,
where $\mathcal{N}(0, \sigma)$ is a Gaussian random number with mean 0
and standard deviation $\sigma$. Then, we sort all objects by $V_{\rm max}^{\rm scat}$
and select objects starting from the one with the
largest $V_{\rm max}^{\rm scat}$, continuing until we reach the proper number
density in each redshift bin.
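
The selection just described can be sketched in a few lines of Python. This is a minimal illustration and not the production HAM code: the function name `ham_select`, the toy velocity values and the scatter value are assumptions made here, and the target number of objects stands in for the observed number density times the box volume.

```python
import numpy as np

def ham_select(v_max, sigma, n_target, rng):
    """Return indices of the n_target objects with the largest scattered V_max.

    v_max    : maximum circular velocities of haloes and subhaloes
    sigma    : standard deviation of the fractional Gaussian scatter
    n_target : number of objects to keep (number density times volume)
    rng      : numpy.random.Generator, so that the draw is reproducible
    """
    v_max = np.asarray(v_max, dtype=float)
    # Scattered velocity: V_max * (1 + N(0, sigma))
    v_scat = v_max * (1.0 + rng.normal(0.0, sigma, size=v_max.shape))
    # Sort in descending order of the scattered velocity, keep the top n_target
    order = np.argsort(v_scat)[::-1]
    return order[:n_target]

# Toy usage with made-up velocities (km/s); values are purely illustrative
rng = np.random.default_rng(42)
v_toy = rng.uniform(100.0, 1000.0, size=10_000)
selected = ham_select(v_toy, sigma=0.3, n_target=500, rng=rng)
```

With `sigma = 0` the selection reduces to a plain rank ordering in $V_{\rm max}$; increasing `sigma` mixes lower-velocity objects into the sample, which is the single degree of freedom used to tune the clustering amplitude.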

By construction, the method reproduces the observed
luminosity function (or stellar mass function). It also reproduces
the scale dependence of galaxy clustering over a large
range of epochs (Conroy et al. 2006; Guo et al. 2010). When
abundance matching is used for the observed stellar mass
function (Li & White 2009), it also gives a reasonable fit to
lensing results (Mandelbaum et al. 2006) and to the relation
between stellar and virial mass (Guo et al. 2010).


2.2 Generation of mock galaxy catalogues 


All covariance matrix estimates based on a finite number
of mock catalogues, $N_s$, are affected by noise, which must
be propagated into the final constraints. The impact of the
uncertainties in the covariance matrix on the derived cosmological
constraints has been the subject of several recent analyses
(Dodelson & Schneider 2013; Taylor et al. 2013; Percival
et al. 2014). In particular, Dodelson & Schneider (2013)
showed that this additional uncertainty can be described by
a rescaling of the parameter covariances derived from the
distribution of measurements from a set of mocks with a
factor given by

$$ m = 1 + \frac{(N_s - N_b - 2)(N_b - N_p)}{(N_s - N_b - 1)(N_s - N_b - 4)}, \qquad (1) $$

where $N_b$ is the number of bins in the corresponding clustering
measurements and $N_p$ is the number of parameters
measured. This implies that a large number of mock catalogues
is necessary for a robust analysis of the galaxy
clustering data.
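
To make the dependence on the number of mocks concrete, the following sketch evaluates the rescaling factor, assuming Eq. (1) has the form $m = 1 + (N_s - N_b - 2)(N_b - N_p)/[(N_s - N_b - 1)(N_s - N_b - 4)]$. The function name and the bin and parameter counts in the usage loop are hypothetical and chosen only for illustration.

```python
def covariance_rescaling(n_s, n_b, n_p):
    """Rescaling factor m of the parameter covariance for n_s mocks.

    n_s : number of mock catalogues used to estimate the covariance matrix
    n_b : number of bins in the clustering measurement
    n_p : number of fitted parameters
    """
    if n_s - n_b - 4 <= 0:
        raise ValueError("need n_s > n_b + 4 for the factor to be defined")
    numerator = (n_s - n_b - 2) * (n_b - n_p)
    denominator = (n_s - n_b - 1) * (n_s - n_b - 4)
    return 1.0 + numerator / denominator

# Hypothetical measurement with 90 bins and 5 parameters
for n_mocks in (1000, 2048, 12288):
    print(n_mocks, covariance_rescaling(n_mocks, n_b=90, n_p=5))
```

For $N_s \gg N_b$ the factor approaches $1 + (N_b - N_p)/N_s$, so doubling the number of mocks roughly halves the excess over unity.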

For the anisotropic BAO measurements of Cuesta et al.
(2015) the estimation of the full covariance matrix of the
monopole and quadrupole of the two-dimensional correlation
function from the ensemble of 1,000 QPM mocks corresponds
to an additional uncertainty of 2% on the constraints on
$H(z)\,r_d$ and $D_A(z)/r_d$. Using the 2,048 MultiDark PATCHY
mock catalogues, the effect is reduced to the order of 1%.
Large sets of catalogues are even more important for full-shape
fits of anisotropic clustering measurements, where the
inclusion of information from smaller scales can significantly
improve the constraints based on redshift space distortions
(RSD; requiring a larger number of bins). For example, in
the analysis of Sánchez et al. (in prep.), based on measurements
of the clustering wedges statistic (Kazin et al. 2012),
the use of mock catalogues corresponds to a rescaling of the
parameter covariances by $m = 1.085$ and $1.04$ when using
1,000 or 2,048 catalogues, respectively. This additional uncertainty
corresponds to a degradation of the true constraining
power of the clustering measurements, which should be
minimized by using a larger number of mock catalogues.
For this reason we have made the effort in the BOSS collaboration
of producing at least 1,000 mocks for each BOSS
DR11&DR12 sub-sample.

The strategy for the massive production of mock galaxy
catalogues relies on generating dark matter fields with approximate
gravity solvers on a mesh. We use grids of 960$^3$
cells with volumes of ($2.5\,h^{-1}$ Gpc)$^3$ and resolutions of
$2.6\,h^{-1}$ Mpc, for which the structure formation model can
be considered to be accurate (§2.2.1). Then the galaxies are
populated on the mesh according to a combined nonlinear
deterministic (§2.2.2) and stochastic (§2.2.3) bias model. In a
post-processing step we assign halo/stellar masses to each
object (§2.2.5). Finally we apply the survey geometry and
selection functions (§2.2.6).

Let us start by describing the PATCHY code (PerturbAtion
Theory Catalog generator of Halo and galaxY distributions).

2.2.1 Approximate fast structure formation model 

We rely on Augmented Lagrangian Perturbation Theory
(ALPT) to simulate structure formation. Let us recap the
basics of this method and refer for details to Kitaura &
Heß (2013). In this approximation the displacement field
$\boldsymbol{\Psi}(\mathbf{q}, z)$, mapping a distribution of dark matter particles at
initial Lagrangian positions $\mathbf{q}$ to the final Eulerian positions
$\mathbf{x}(z)$ at redshift $z$ ($\mathbf{x}(z) = \mathbf{q} + \boldsymbol{\Psi}(\mathbf{q}, z)$), is split into a long-range
$\boldsymbol{\Psi}_{\rm L}(\mathbf{q}, z)$ and a short-range component $\boldsymbol{\Psi}_{\rm S}(\mathbf{q}, z)$, i.e.
$\boldsymbol{\Psi}(\mathbf{q}, z) = \boldsymbol{\Psi}_{\rm L}(\mathbf{q}, z) + \boldsymbol{\Psi}_{\rm S}(\mathbf{q}, z)$.

We rely on second order Lagrangian Perturbation Theory
(2LPT) for the long-range component $\boldsymbol{\Psi}_{\rm 2LPT}$ (for details
on 2LPT see Buchert 1994; Bouchet et al. 1995; Catelan
1995).

The resulting displacement field is filtered with a kernel
$\mathcal{K}$: $\boldsymbol{\Psi}_{\rm L}(\mathbf{q}, z) = \mathcal{K}(\mathbf{q}, r_{\rm S}) \circ \boldsymbol{\Psi}_{\rm 2LPT}(\mathbf{q}, z)$. We apply a Gaussian
filter $\mathcal{K}(\mathbf{q}, r_{\rm S}) = \exp(-|\mathbf{q}|^2/(2 r_{\rm S}^2))$, with $r_{\rm S}$ being the
smoothing radius. We use the spherical collapse approximation
to model the short-range component $\boldsymbol{\Psi}_{\rm SC}(\mathbf{q}, z)$ (see
Bernardeau 1994; Mohayaee et al. 2006; Neyrinck 2013). The
combined ALPT displacement field

$$ \boldsymbol{\Psi}_{\rm ALPT}(\mathbf{q}, z) = \mathcal{K}(\mathbf{q}, r_{\rm S}) \circ \boldsymbol{\Psi}_{\rm 2LPT}(\mathbf{q}, z) + (1 - \mathcal{K}(\mathbf{q}, r_{\rm S})) \circ \boldsymbol{\Psi}_{\rm SC}(\mathbf{q}, z) \qquad (2) $$

is used to move a set of homogeneously distributed particles
from Lagrangian initial conditions to the Eulerian final
ones. We then grid the particles following a cloud-in-cell
scheme to produce a smooth density field $\delta^{\rm ALPT}$. One may
get some improvements by preventing voids within larger collapsing
regions, which essentially extends the collapsing region
towards moderate underdensities (see MUSCLE method
in Neyrinck 2016). This approach requires about eight additional
convolutions, making it about twice as expensive as the
approach used here. Moreover, we have checked that the
improvement provided by including MUSCLE is not perceptible
when using grids with cell sizes of $2.6\,h^{-1}$ Mpc.
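
The combination of the long- and short-range displacements in Eq. (2) can be illustrated with a short NumPy sketch. It is not taken from the PATCHY code: the function names are hypothetical, the two input displacement fields are assumed to be already available on a periodic cubic grid with shape (3, n, n, n), and the Gaussian kernel is assumed to be applied in Fourier space normalised to unity at $k = 0$, i.e. as $\exp(-k^2 r_{\rm S}^2/2)$.

```python
import numpy as np

def gaussian_kernel_k(n, box_size, r_s):
    """Fourier-space Gaussian filter exp(-k^2 r_s^2 / 2) on an n^3 periodic grid."""
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    return np.exp(-0.5 * (kx**2 + ky**2 + kz**2) * r_s**2)

def alpt_combine(psi_2lpt, psi_sc, box_size, r_s):
    """Return K o psi_2lpt + (1 - K) o psi_sc for fields of shape (3, n, n, n)."""
    n = psi_2lpt.shape[-1]
    kern = gaussian_kernel_k(n, box_size, r_s)
    out = np.empty_like(psi_2lpt, dtype=float)
    for axis in range(3):
        # Low-pass filter the 2LPT component, high-pass filter the SC component
        long_k = kern * np.fft.fftn(psi_2lpt[axis])
        short_k = (1.0 - kern) * np.fft.fftn(psi_sc[axis])
        out[axis] = np.fft.ifftn(long_k + short_k).real
    return out

# Toy usage with random fields standing in for the 2LPT and SC displacements
rng = np.random.default_rng(1)
psi_a = rng.normal(size=(3, 8, 8, 8))
psi_b = rng.normal(size=(3, 8, 8, 8))
psi_alpt = alpt_combine(psi_a, psi_b, box_size=100.0, r_s=5.0)
```

Two limits serve as sanity checks of the sketch: for $r_{\rm S} \to 0$ the kernel is unity and the pure 2LPT displacement is recovered, and if both inputs are identical the output equals the input for any $r_{\rm S}$.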

2.2.2 Deterministic bias relations 

In this section we describe the deterministic part of our bias
model. This is combined with a stochastic element, described
in §2.2.3, and a nonlocal element, described in §2.2.5, to
produce the full model. The deterministic bias relates the
expected number counts of haloes or galaxies $\rho_g \equiv \langle N_g \rangle_{\partial V}$
in a given finite volume to the underlying dark matter field
$\rho_M$, with $\langle [\dots] \rangle_{\partial V}$ being the ensemble average over the differential
volume element $\partial V$ (in our case the cell of a regular
mesh). This relation is known to be nonlinear, nonlocal
and stochastic (Press & Schechter 1974; Peacock & Heavens
1985; Bardeen et al. 1986; Fry & Gaztañaga 1993; Mo &
White 1996; Dekel & Lahav 1999; Sheth & Lemson 1999;
Seljak 2000; Mo & White 2002; Berlind & Weinberg 2002;
Smith et al. 2007; Desjacques et al. 2010; Beltrán Jiménez &
Durrer 2011; Valageas & Nishimichi 2011; Elia et al. 2012;
Chan et al. 2012; Baldauf et al. 2012, 2013; Ahn et al. 2015).

In general this bias relation will be arbitrarily complex:

$$ \rho_g = f_g\, B(\rho_M), \qquad (3) $$

with $B(\rho_M)$ being a general bias function, $f_g = \langle \rho_g \rangle_V / \langle B(\rho_M) \rangle_V$,
$\langle \rho_g \rangle_V$ being the number density $\bar{N}_g$, and $\langle [\dots] \rangle_V$ being the
ensemble average over the whole considered volume $V$ (in
our case the volume of the mesh).

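The deterministic bias model we consider in this work
has the following form:

$$ \rho_g = f_g\, \theta(\rho_M - \rho_{\rm th})\, \exp\left[-\left(\frac{\rho_M}{\rho_\epsilon}\right)^{\epsilon}\right] \rho_M^{\alpha}\, (\rho_M - \rho_{\rm th})^{\tau}, \qquad (4) $$

with

$$ f_g = \bar{N}_g \Big/ \left\langle \theta(\rho_M - \rho_{\rm th})\, \exp\left[-\left(\frac{\rho_M}{\rho_\epsilon}\right)^{\epsilon}\right] \rho_M^{\alpha}\, (\rho_M - \rho_{\rm th})^{\tau} \right\rangle_V, \qquad (5) $$

and $\{\rho_{\rm th}, \alpha, \epsilon, \rho_\epsilon, \tau\}$ the parameters of the model. We have
modelled threshold bias (Kaiser 1984; Bardeen et al. 1986;
Cole & Kaiser 1989; Sheth et al. 2001; Mo & White 2002) as
a combination of a step function $\theta(\rho_M - \rho_{\rm th})$ (Kitaura et al.
2014) and an exponential cut-off $\exp[-(\rho_M/\rho_\epsilon)^{\epsilon}]$ (Neyrinck
et al. 2014). The local bias expansion (Cen & Ostriker 1993;
Fry & Gaztañaga 1993) is summarized by a power-law $\rho_M^{\alpha}$ (de
la Torre & Peacock 2013; Kitaura et al. 2014). In addition
we consider a bias term $(\rho_M - \rho_{\rm th})^{\tau}$ which compensates for the
missing power of PT based methods.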
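
As a sketch of how such a deterministic bias relation acts on a gridded density field, the following Python function evaluates the product of a threshold step, an exponential cut-off, a power law and the $(\rho_M - \rho_{\rm th})^{\tau}$ term, and fixes the normalisation $f_g$ so that the mesh average equals the target number density. The functional form assumed here is $f_g\,\theta(\rho_M - \rho_{\rm th})\exp[-(\rho_M/\rho_\epsilon)^{\epsilon}]\,\rho_M^{\alpha}(\rho_M - \rho_{\rm th})^{\tau}$; the function name, the lognormal toy field and all parameter values are illustrative assumptions, not calibrated values from this work.

```python
import numpy as np

def deterministic_bias(rho_m, n_bar, rho_th, alpha, eps, rho_eps, tau):
    """Expected galaxy number counts per cell for a positive density field rho_m.

    The result is normalised so that its mean over the mesh equals n_bar.
    """
    rho_m = np.asarray(rho_m, dtype=float)
    above = rho_m > rho_th              # step function theta(rho_M - rho_th)
    shape = np.zeros_like(rho_m)
    r = rho_m[above]                    # only cells above threshold contribute
    shape[above] = (np.exp(-(r / rho_eps) ** eps)   # exponential cut-off
                    * r ** alpha                    # power-law local bias
                    * (r - rho_th) ** tau)          # extra threshold term
    mean_shape = shape.mean()
    if mean_shape == 0.0:
        raise ValueError("no cell exceeds the density threshold")
    f_g = n_bar / mean_shape            # normalisation, cf. Eq. (5)
    return f_g * shape

# Toy usage: lognormal field standing in for the gridded dark matter density
rng = np.random.default_rng(3)
rho_toy = rng.lognormal(mean=0.0, sigma=1.0, size=(16, 16, 16))
rho_gal = deterministic_bias(rho_toy, n_bar=0.02, rho_th=0.5,
                             alpha=1.2, eps=1.0, rho_eps=5.0, tau=0.3)
```

Masking the cells below threshold before taking the powers avoids raising negative numbers to non-integer exponents, which would otherwise produce NaN values.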

Nonlocal bias has been recently found to be relevant
(McDonald & Roy 2009; Baldauf et al. 2012; Chan et al.
2012; Sheth et al. 2013; Saito et al. 2014). A nonlocal bias
introduces a scatter in the local deterministic bias relations
described above. In this work, the scatter is first described by
a stochastic bias relation (see §2.2.3). We have investigated
second order nonlocal bias with PATCHY without finding
a relevant effect on the mock catalogues when considering
stochastic bias and the full (one single mass bin)
catalogue (see Autefage et al. in prep.). However, once one
considers different populations of halo or stellar mass objects,
nonlocal bias plays an important role. We solve
this in a post-processing step when assigning the masses to
each galaxy (see §2.2.5 and Zhao et al. 2015).


2.2.3 Stochastic biasing 

The halo distribution is a discrete sample $N_{g,i}$ of the continuous
underlying dark matter distribution $\rho_{g,i}$:

$$ N_{g,i} \curvearrowleft P(N_{g,i} \mid \rho_{g,i}, \{p_{\rm SB}\}), \qquad (6) $$

for each cell $i$ and $\{p_{\rm SB}\}$ being the set of stochastic bias parameters.
To account for the shot noise one could do Poissonian
realizations of the halo density field as given by the
deterministic bias and the dark matter field (see e.g. de la
Torre & Peacock 2013). However, it is known that the excess
probability of finding haloes in high density regions generates
over-dispersion (Somerville et al. 2001; Casas-Miranda
et al. 2002).

The strategy up to now has been to generate a mock
catalogue which reproduces the clustering of the whole population
of galaxies for a given redshift. This has the advantage
that by mixing massive and low mass galaxies we will
always be dominated by over-dispersion, which is much easier
to model than under-dispersion. In particular we consider
the negative binomial (NB) probability distribution function
(for non-Poissonian distributions see Saslaw & Hamilton
1984; Sheth 1995) including an additional parameter $\beta$
to model over-dispersion (it tends towards the Poisson probability
distribution function for $\beta \to \infty$ and for low $\lambda$ values).
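
A negative binomial draw with expectation $\lambda$ and over-dispersion parameter $\beta$ can be sketched with NumPy as follows. The parametrisation is an assumption made for this illustration: NumPy's `negative_binomial(n, p)` is called with $n = \beta$ and $p = \beta/(\beta + \lambda)$, which gives mean $\lambda$ and variance $\lambda + \lambda^2/\beta$, so that the Poisson case is approached for $\beta \to \infty$ or small $\lambda$. The function name and toy values are hypothetical.

```python
import numpy as np

def sample_negative_binomial(lam, beta, rng):
    """Draw over-dispersed number counts with mean lam per cell.

    lam  : expected counts per cell (array-like, non-negative)
    beta : over-dispersion parameter (> 0); large beta approaches Poisson
    rng  : numpy.random.Generator
    """
    lam = np.asarray(lam, dtype=float)
    # Success probability chosen so that the mean equals lam
    p = beta / (beta + lam)
    return rng.negative_binomial(beta, p)

# Toy usage: identical expectation in every cell, for a quick moment check
rng = np.random.default_rng(0)
lam_toy = np.full(200_000, 2.0)
counts = sample_negative_binomial(lam_toy, beta=1.5, rng=rng)
print(counts.mean(), counts.var())
```

In a mock production setting `lam` would be the expected counts per cell from the deterministic bias relation, and $\beta$ would be one of the stochastic bias parameters calibrated against the reference catalogue.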

We note that a proper treatment of the deviation from
Poissonity is also crucial to get accurate density reconstructions
(see Ata et al. 2015, and Ata et al. in prep.).

We will need, however, to take care of the different statistical
nature of each population of galaxies when we assign
masses to each object (see §2.2.5).


2.2.4 Redshift space distortions

Let us recap here the way in which RSDs are treated in the
PATCHY code (see Kitaura et al. 2014).

The mapping between Eulerian real space $\mathbf{x}(z)$ and
redshift space $\mathbf{s}(z)$ is given by: $\mathbf{s}(z) = \mathbf{x}(z) + \mathbf{v}_r(z)$, with
$\mathbf{v}_r \equiv (\mathbf{v} \cdot \hat{\mathbf{r}})\,\hat{\mathbf{r}}/(Ha)$, where $\hat{\mathbf{r}}$ is the unit sight line vector, $H$
the Hubble constant, $a$ the scale factor, and $\mathbf{v} = \mathbf{v}(\mathbf{x})$ the
3-d velocity field interpolated at the position of each halo in
Eulerian space $\mathbf{x}$ using the displacement field $\boldsymbol{\Psi}_{\rm ALPT}(\mathbf{q}, z)$.
We split the peculiar velocity field into a coherent $\mathbf{v}^{\rm coh}$ and a
(quasi) virialized component $\mathbf{v}^{\sigma}$: $\mathbf{v} = \mathbf{v}^{\rm coh} + \mathbf{v}^{\sigma}$. The coherent
peculiar velocity field is computed in Lagrangian space
from the linear Gaussian field $\delta^{(1)}(\mathbf{q})$ using the ALPT formulation
consistently with the displacement field (see Eq. 2):

$$ \mathbf{v}^{\rm coh}_{\rm ALPT}(\mathbf{q}, z) = \mathcal{K}(\mathbf{q}, r_{\rm S}) \circ \mathbf{v}_{\rm 2LPT}(\mathbf{q}, z) + (1 - \mathcal{K}(\mathbf{q}, r_{\rm S})) \circ \mathbf{v}_{\rm SC}(\mathbf{q}, z), \qquad (7) $$

with $\mathbf{v}_{\rm 2LPT}(\mathbf{q}, z)$ being the second order and $\mathbf{v}_{\rm SC}(\mathbf{q}, z)$ being
the spherical collapse component (for details see Kitaura
et al. 2014).

We use the high correlation between the local density
field and the velocity dispersion to model the displacement
due to (quasi) virialized motions. Effectively, we sample a
Gaussian distribution function ($\mathcal{G}$) with a dispersion given
by $\sigma_v \propto (1 + b^{\rm ALPT}\delta^{\rm ALPT}(\mathbf{x}))^{\gamma}$. Consequently,

$$ \mathbf{v}_r^{\sigma} \equiv (\mathbf{v}^{\sigma} \cdot \hat{\mathbf{r}})\,\hat{\mathbf{r}}/(Ha) = \mathcal{G}\left(g \times (1 + b^{\rm ALPT}\delta^{\rm ALPT}(\mathbf{x}))^{\gamma}\right)\hat{\mathbf{r}}. \qquad (8) $$

For the Gaussian streaming model see Reid & White
(2011), for non-Gaussian models see e.g. Tinker (2007). In
closely virialized systems the kinetic energy approximately
equals the gravitational energy and a Keplerian law predicts
$\gamma$ close to 0.5, leaving only the proportionality constant $g$ as
a free parameter in the model (see also Heß et al. 2013).
We assign larger dispersion velocities to low mass objects
considered to be satellites. The parameters $g$ and $\gamma$ have
been adjusted to fit the damping effect in the monopole and
quadrupole as found in the BigMultiDark N-body simulation
first and later further constrained with the BOSS DR12
data for different redshift bins (see discussion in §3).
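
The real-to-redshift-space mapping $\mathbf{s} = \mathbf{x} + (\mathbf{v} \cdot \hat{\mathbf{r}})\,\hat{\mathbf{r}}/(Ha)$ described in this section can be written compactly for a catalogue of objects. The sketch below is illustrative only: the function name, the observer position and the units (positions in $h^{-1}$ Mpc, velocities in km s$^{-1}$, $H$ in km s$^{-1}$ per $h^{-1}$ Mpc) are assumptions, and the velocity passed in is taken to already contain whichever coherent and dispersion components one wishes to include.

```python
import numpy as np

def to_redshift_space(x, v, hubble, a, observer=(0.0, 0.0, 0.0)):
    """Shift positions along the line of sight by (v . r_hat) / (H a).

    x, v     : arrays of shape (n_objects, 3) with positions and velocities
    hubble   : Hubble rate at the redshift of the objects
    a        : scale factor at that redshift
    observer : observer position; objects must not coincide with it
    """
    x = np.asarray(x, dtype=float)
    v = np.asarray(v, dtype=float)
    rel = x - np.asarray(observer, dtype=float)
    dist = np.linalg.norm(rel, axis=1, keepdims=True)
    r_hat = rel / dist                                   # unit sight line vectors
    v_los = np.sum(v * r_hat, axis=1, keepdims=True)     # line-of-sight velocity
    return x + v_los * r_hat / (hubble * a)

# Toy usage: one object moving partly along, one purely across the sight line
x_toy = np.array([[100.0, 0.0, 0.0], [0.0, 50.0, 0.0]])
v_toy = np.array([[200.0, 300.0, 0.0], [70.0, 0.0, 0.0]])
s_toy = to_redshift_space(x_toy, v_toy, hubble=100.0, a=1.0)
```

Only the velocity component along the sight line changes the apparent position; the second toy object, which moves purely across the line of sight, is left where it is.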

2.2.5 Halo/stellar mass distribution reconstruction 

Figure 2. Pie plot of the BOSS DR12 observations (upper left region), and one MultiDark PATCHY mock realization (lower right
region).

Once we have a spatial distribution of objects $\{r_g\}$ which
accurately reproduces the clustering of the whole galaxy sample
at a given redshift, we assign the halo/stellar masses
$M_g$ to each object $l$ according to the statistical information
extracted from the BigMultiDark simulation using the
Halo mAss Distribution ReconstructiON (HADRON) code
(for technical details see Zhao et al. 2015). In particular
we sample the following conditional probability distribution
function

$$M_g \curvearrowleft P\left(M_g \mid \{r_g\}, \rho_M, T, \Delta r^{M}_{\min}, \{p_c\}, z\right), \qquad (9)$$

where $\rho_M$ is the local density, $T$ the tidal field tensor (in
particular its eigenvalues), $\Delta r^{M}_{\min}$ a minimum separation
between massive objects due to exclusion effects, $\{p_c\}$ a set
of cosmological parameters, and $z$ the redshift at which we
want to apply the mass reconstruction. We note that at this
stage we consider nonlocal biasing through the tidal field
tensor and the minimum separation of objects. Using all
this information, it has been shown that one can recover
clustering compatible with the $N$-body simulation for arbitrary
halo mass cuts up to scales of about $k = 0.3\,h\,{\rm Mpc}^{-1}$
(Zhao et al. 2015). We extend the algorithm to stellar masses
including the rank ordering relation and scatter described in
§2.1.


2.2.6 Survey generator

The SUrvey GenerAtoR (SUGAR) code is an OpenMP code
which constructs light-cones from mock galaxy catalogues
(see Rodríguez-Torres et al. 2015, companion paper). This
code applies geometrical features of the survey, including
the geometry (using the publicly available MANGLE mask;
Swanson et al. 2008), sector completeness, veto masks and
radial selection functions.

The SUGAR code can construct light-cones using a single
box or multiple boxes at different redshifts, in order
to include the redshift evolution in the final catalogue. The
first step in the construction of the light-cone is to locate the
observer ($z = 0$) and to transform from comoving Cartesian
coordinates to equatorial coordinates (RA, Dec) and redshift.
To compute the observed redshift (redshift space) of an object,
we first compute the comoving distance from the observer
to the object, and then we transform it to redshift
space following $s = r_c + (v \cdot \hat{r})/(aH(z_{\rm real}))$ (see §2.2.4), where
$r_c(z)$ is computed from

$$r_c(z_{\rm real}) = \int_0^{z_{\rm real}} \frac{c\,dz'}{H(z')}.$$
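
As an illustration of this redshift-space mapping, the following Python sketch evaluates the comoving distance integral and the line-of-sight shift for an observer at the origin. It assumes a flat ΛCDM expansion rate; the function names and the default cosmological parameters are placeholders chosen for this example and are not quoted in this section. Positions are taken in Mpc and velocities in km/s.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s

def hubble(z, h0=67.77, omega_m=0.307):
    """H(z) in km/s/Mpc for flat LCDM (default values are placeholders)."""
    return h0 * np.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))

def comoving_distance(z, n_steps=2048, **cosmo):
    """r_c(z) = int_0^z c dz'/H(z') in Mpc, using the trapezoidal rule."""
    zz = np.linspace(0.0, z, n_steps)
    integrand = C_KM_S / hubble(zz, **cosmo)
    dz = zz[1] - zz[0]
    return dz * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

def redshift_space_distance(pos, vel, z_real, **cosmo):
    """s = r_c + (v . r_hat) / (a H(z_real)) for an observer at the origin."""
    pos = np.asarray(pos, dtype=float)
    vel = np.asarray(vel, dtype=float)
    r_c = np.linalg.norm(pos, axis=-1, keepdims=True)
    r_hat = pos / r_c                       # unit vector along the line of sight
    v_los = np.sum(vel * r_hat, axis=-1)    # radial velocity component
    a = 1.0 / (1.0 + z_real)
    return r_c[..., 0] + v_los / (a * hubble(z_real, **cosmo))
```

Only the radial component of the velocity enters, so purely tangential motions leave the redshift-space distance unchanged.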

Once we compute the redshift of each galaxy, we con¬ 
sider two options to select objects in the radial direction: 

(i) downsampling: this option preserves the clustering of 
the input box selecting objects randomly to have the desired 
number density. 

(ii) selecting by halo property: this consists of rank order¬ 
ing objects by a halo property and selecting them sequen¬ 
tially until the correct number density is obtained. 
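
The two radial selection options can be sketched as follows. This is an illustrative Python example rather than the SUGAR implementation: the function names are hypothetical, and selecting in descending order of the halo property is an assumption, since the text does not state the direction of the ranking.

```python
import numpy as np

def target_count(number_density, shell_volume):
    """Number of objects required in a radial shell of the given volume."""
    return int(round(number_density * shell_volume))

def downsample(n_objects, n_target, rng=None):
    """Option (i): indices of n_target objects drawn uniformly at random."""
    rng = np.random.default_rng() if rng is None else rng
    return np.sort(rng.choice(n_objects, size=n_target, replace=False))

def select_by_rank(halo_property, n_target):
    """Option (ii): indices of the n_target objects with the largest property."""
    order = np.argsort(halo_property)[::-1]  # descending rank order (assumed)
    return np.sort(order[:n_target])
```

Random downsampling leaves the clustering of the input box unchanged on average, whereas rank-ordered selection changes the bias of the selected sample because it preferentially keeps objects with extreme values of the chosen property.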


3 RESULTS: STATISTICAL COMPARISON BETWEEN THE MULTIDARK PATCHY MOCKS AND THE BOSS DR12 DATA

Following the method described in §2 we generate 12,288
mock light-cone galaxy catalogues for BOSS DR12 (2,048
for each of LOWZ, CMASS, and the combined sample, in both the
southern and northern galactic caps). We call these catalogues
MultiDark PATCHY mocks, MD PATCHY mocks in short. The
corresponding computations required about 500,000 CPU hours
(30-50 min for each box on 16 cores and a total of 40,960 boxes).
Since each PATCHY+HADRON run requires less than 24 GB of shared
memory for a grid with $960^3$ cells, we were able to make use of 128
nodes with 32 GB each in parallel at the BSC MareNostrum
facilities, taking about one week of wall-clock time for
all 40,960 catalogues. The light-cone generation with SUGAR
required an additional ~1,000 CPU hours. The equivalent
computations based on $N$-body simulations would have
required about 9,000 million CPU hours (~2.3 million CPU
hours for each light-cone). The effective number of particles
is ~$(61,440)^3$ (given that the reference catalogue required
$3,840^3$ particles to resolve the objects we reproduce in the
MD PATCHY catalogues).
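
The quoted computing costs can be cross-checked with simple arithmetic. The short Python sketch below only uses the numbers given in the paragraph above; reading the ratio $(61,440/3,840)^3 = 4,096$ as the number of reference-size volumes behind the $N$-body estimate is our interpretation of those numbers, not a statement made in the text.

```python
def cpu_hours(n_boxes, cores, minutes_per_box):
    """Total CPU hours for n_boxes runs of the given length on `cores` cores."""
    return n_boxes * cores * minutes_per_box / 60.0

n_boxes, cores = 40_960, 16
low = cpu_hours(n_boxes, cores, 30)
high = cpu_hours(n_boxes, cores, 50)
print(f"PATCHY+HADRON cost: {low:,.0f} to {high:,.0f} CPU hours")

# Side-length ratio between the effective and the reference particle grids.
ratio = 61_440 // 3_840
n_volumes = ratio ** 3
print(f"effective / reference volume ratio: {n_volumes}")

# N-body cost if each such volume needs about 2.3 million CPU hours.
nbody_hours = n_volumes * 2.3e6
print(f"N-body estimate: {nbody_hours:.2e} CPU hours")
```

The range of roughly 330,000 to 550,000 CPU hours brackets the quoted figure of about 500,000, and 4,096 volumes at 2.3 million CPU hours each give about 9.4 billion CPU hours, consistent with the quoted 9,000 million.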

We used 10 redshift bins to construct the light-cones.
This permits us to model the evolution of the galaxy bias, the
growth, and the peculiar motions as a function of redshift. A
visualization of the BOSS DR12 data and one MD PATCHY mock
realization is shown in Fig. 2. We can clearly see from this
plot that both the data and the mocks follow the same
selection criteria including the survey mask (the colour code
stands for the stellar mass), and there are no obvious
visual differences beyond cosmic variance. The empty regions
seem to be similarly distributed in both cases, suggesting
that the three-point statistics should be close. The
statistical comparison between the MD PATCHY mock galaxy
catalogues and the BOSS DR12 observations yields good
agreement. The number densities for the LOWZ and CMASS
galaxy samples are recovered by construction (see Fig. 3).
We investigate the performance of the mock galaxy
catalogues in detail in the following subsections.

To avoid redundancy we show only the results for BOSS
DR12, as the only difference with respect to the BOSS DR11
mocks is the applied mask and selection function.

We have produced half the amount of mock catalogues for
DR11, i.e., 1,024 for each LOWZ, CMASS, combined, southern
and northern galactic cap.


3.1 Two-point and three-point correlation functions


We first perform an analysis in configuration space,
computing the two- and three-point correlation functions. To
compute the clustering signal in the correlation function for
the MD PATCHY mock light-cones and the observed data we
rely on the Landy & Szalay (1993) estimator. We follow
their notation, referring to the data sample (either
simulation or observed data) as D and to the random catalogue
as R. The correlation function is then constructed in the
following way:

$$\xi(s) = \frac{DD - 2DR + RR}{RR}, \qquad (10)$$

as a function of separation between pairs of galaxies in
redshift space $s$.
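
A brute-force version of this estimator can be written compactly in Python. The sketch below is only meant to make the normalisation of the pair counts explicit; the function names are hypothetical, the all-pairs distance matrix limits it to small catalogues, and survey analyses use tree-based or gridded pair counters and per-object weights, none of which are shown here.

```python
import numpy as np

def pair_counts(a, b, edges, auto=False):
    """Histogram of pair separations between point sets a and b (brute force)."""
    diff = a[:, None, :] - b[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    if auto:
        # Count each unordered pair once and drop self-pairs.
        dist = dist[np.triu_indices(len(a), k=1)]
    return np.histogram(dist.ravel(), bins=edges)[0].astype(float)

def landy_szalay(data, randoms, edges):
    """xi(s) = (DD - 2DR + RR) / RR with normalised pair counts."""
    nd, nr = len(data), len(randoms)
    dd = pair_counts(data, data, edges, auto=True) / (nd * (nd - 1) / 2)
    rr = pair_counts(randoms, randoms, edges, auto=True) / (nr * (nr - 1) / 2)
    dr = pair_counts(data, randoms, edges) / (nd * nr)
    return (dd - 2.0 * dr + rr) / rr
```

Each count is divided by the total number of available pairs, so that an unclustered data set with the same geometry as the randoms gives a correlation function close to zero in every bin.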

The three-point correlation function gives a description
of the probability of finding three objects in three different
volumes, and can be computed following Szapudi & Szalay
(1998)

$$\zeta(s_{12}, s_{23}, s_{13}) = \frac{DDD - 3DDR + 3DRR - RRR}{RRR}, \qquad (11)$$

as a function of separation between the vertices of triangles
spanned by triplets of galaxies in redshift space $s_{12}$, $s_{23}$, $s_{13}$.

Fig. 4 shows that we accurately recover the clustering
(monopole) for arbitrary stellar mass bins, showing an
almost perfect agreement with observations. Only for the two
largest stellar mass bins do we find deviations larger than 1-$\sigma$.
This is mainly due to the “halo exclusion effect”, which is
only approximately modelled, assuming a minimum
separation for massive galaxies, and not the full separation
distribution function (Zhao et al. 2015). We find, however, that
these differences are not critical, as they are restricted to
small scales ($\lesssim 20\,h^{-1}$ Mpc) and only a low number of
objects are affected. We further compute the monopole and
quadrupole for LOWZ and CMASS (see Fig. 5 and §3.3).
The monopole agrees within 1-$\sigma$ towards small scales, down to
a few Mpc.

There is a deviation of the monopole around the BAO 
peak and towards larger scales. While the galaxy mock cata¬ 
logues cross zero right after the BAO peak, the observations 
do not. In this study, we have applied all of the systematic 
weights, such as the stellar density contamination, detailed 
in Reid et al. (2016) and Ross et al. (in prep.). The corre¬ 
lation function measurements are quite covariant between s 
bins at these scales, making the deviations less significant 
than one would expect by the visual impression. The sig¬ 
nificance and potential causes of the large-scale excess are 
studied in Ross et al. (in prep.), where it is also shown that 
it has no significant impact on BAO measurements. This 
is even more so, as the overall shape of $\xi(s)$ in BAO
measurements is marginalized over with a polynomial (see e.g.
Anderson et al. 2014). See also Ross et al. (2012); Chuang 
et al. (2013) for similar studies on an earlier BOSS data set 
and Huterer et al. (2013) for potential photometric calibra¬ 
tion systematics, which have not been accounted for in this 
analysis. 

In the case of RSD measurements one has to make sure
that the analysis is performed on scales which are not
affected by systematics (Gil-Marín et al. 2015a, companion
paper). The quadrupole is in very good agreement on all
scales, further supporting that RSD analysis should be safe,
even in case there are some remnant systematics in the data.

Figure 3. Number density for the LOWZ (left) and CMASS (right) samples. The observations are given by the blue solid lines. The
shaded contours represent the 1-$\sigma$ regions according to the MD PATCHY mocks.

Figure 4. Monopole for different stellar mass bins as indicated in the legend with the corresponding colour code. The error bars
represent the BOSS DR12 data. The shaded contours represent the 1-$\sigma$ regions according to the MD PATCHY mocks.

An investigation of the three-point function demonstrates
that the MD PATCHY mocks have a quality very similar
to those based on $N$-body simulations after calibration
(see left-hand and central panels in Fig. 6). We have
constrained the galaxy bias parameters (see §2.2.2, 2.2.3, and
2.2.5) based on the reference catalogues from the BigMultiDark
simulation on cubical full volumes at each of the 10
redshift bins, matching the two- and the three-point
statistics. To fit the latter we focused on matching the higher
order correlation functions through the probability
distribution function of galaxies in the reference catalogues,
following the approach presented in Kitaura et al. (2015).
Using the observations to constrain the three-point statistics
is not trivial, due to incompleteness effects. This explains
why the MD PATCHY mock catalogues fit the reference
catalogue better than the data, especially for the CMASS
galaxies. The three-point statistics perform worse for the
QPM mocks, possibly because they do not include an
iterative validation step fitting higher order statistics (beyond
the two-point correlation function). The nonlinear RSD
parameter (see §2.2.4) was iteratively constrained based on the
observations, as we explain in the next section.


3.2 Monopole and quadrupole in Fourier space 

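The galaxy power spectrum $P$ and the galaxy bispectrum $B$
are the two- and three-point correlation functions in Fourier
space. Given the Fourier transform of the galaxy overdensity,
$\delta_g(\mathbf{x}) = \rho_g(\mathbf{x})/\bar{\rho}_g - 1$,

$$\delta_g(\mathbf{k}) = \int d^3x\, \delta_g(\mathbf{x}) \exp(-i\mathbf{k} \cdot \mathbf{x}), \qquad (12)$$

where $\rho_g(\mathbf{x})$ is the number density of objects and $\bar{\rho}_g$ its
mean value, the galaxy power spectrum and galaxy
bispectrum are defined as

$$\langle \delta_g(\mathbf{k})\,\delta_g(\mathbf{k}') \rangle = (2\pi)^3 P(k)\,\delta^D(\mathbf{k} + \mathbf{k}'), \qquad (13)$$

$$\langle \delta_g(\mathbf{k}_1)\,\delta_g(\mathbf{k}_2)\,\delta_g(\mathbf{k}_3) \rangle = (2\pi)^3 B(\mathbf{k}_1, \mathbf{k}_2)\,\delta^D(\mathbf{k}_1 + \mathbf{k}_2 + \mathbf{k}_3), \qquad (14)$$

with $\delta^D$ being the Dirac delta function. Note that the
bispectrum is only well defined when the set of $k$-vectors $\mathbf{k}_1$,
$\mathbf{k}_2$ and $\mathbf{k}_3$ closes to form a triangle, $\mathbf{k}_1 + \mathbf{k}_2 + \mathbf{k}_3 = 0$. It is
common to define the reduced bispectrum $Q$ as

$$Q(\alpha_{12} \mid k_1, k_2) \equiv \frac{B(\mathbf{k}_1, \mathbf{k}_2)}{P(k_1)P(k_2) + P(k_2)P(k_3) + P(k_1)P(k_3)}, \qquad (15)$$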
Figure 5. Monopole (on the left) and quadrupole (on the right) for LOWZ and CMASS in the first and second rows, respectively. The
shaded contours represent the 1-$\sigma$ regions according to the MD PATCHY mocks, correlation function in red, quadrupole in blue.

Figure 6. Left-hand and central panels: three-point statistics comparing the MD PATCHY mocks (blue shaded region) with the BigMultiDark
mocks of the $N$-body simulation (red shaded region) and the observations (black error bars) for LOWZ (left) and CMASS galaxies
(central). Right-hand panel: three-point statistics comparing the QPM mocks (LOWZ: blue shaded region, CMASS: red shaded region)
to the observations (LOWZ: black error bars, CMASS: green error bars). Corresponding ratios are shown in the bottom panels. Shaded
area shows 1-$\sigma$ uncertainties, $r_1 = 10$ and $r_2 = 20\,h^{-1}$ Mpc, and $\theta$ is the angle between $r_1$ and $r_2$.


where $\alpha_{12}$ is the angle between $\mathbf{k}_1$ and $\mathbf{k}_2$. This quantity
is independent of the overall scale $k$ and redshift at large
scales and for a power spectrum that follows a power law.
Moreover, it presents a characteristic “U-shape” predicted
by gravitational instability. Mode coupling and power law
deviations in the actual power spectrum induce a slight
scale- and time-dependency in this quantity. However, in
practice it has been observed that at scales of the order of
$k \sim 0.1\,h\,{\rm Mpc}^{-1}$ the reduced bispectrum does not present a
high variation in its amplitude.
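
For a closed triangle the third side follows from the other two and the angle between them, $k_3^2 = k_1^2 + k_2^2 + 2k_1k_2\cos\alpha_{12}$, so the reduced bispectrum can be evaluated from a measured bispectrum value and any callable power spectrum. The Python sketch below illustrates this bookkeeping; the function names and the `power` callable are assumptions of the example and do not correspond to a specific code used in the analysis.

```python
import numpy as np

def third_side(k1, k2, alpha12):
    """|k3| for k1 + k2 + k3 = 0, with alpha12 the angle between k1 and k2."""
    return np.sqrt(k1 ** 2 + k2 ** 2 + 2.0 * k1 * k2 * np.cos(alpha12))

def reduced_bispectrum(b, k1, k2, alpha12, power):
    """Q = B / (P1 P2 + P2 P3 + P1 P3) for a callable power spectrum P(k)."""
    k3 = third_side(k1, k2, alpha12)
    p1, p2, p3 = power(k1), power(k2), power(k3)
    return b / (p1 * p2 + p2 * p3 + p1 * p3)
```

With this angle convention, $\alpha_{12} = 2\pi/3$ and $k_1 = k_2$ give an equilateral triangle, while $\alpha_{12} \to 0$ and $\alpha_{12} \to \pi$ correspond to the collapsed configurations at the two ends of the “U-shape”.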

The measurement of the bispectrum is performed in
the same way as the approach described in Gil-Marín et al.
(2015c). This method consists of generating $k$-triangles and
randomly orientating them in $k$-space. When the number
of random triangles is sufficiently large, the mean value of
their bispectra tends to the fiducial bispectrum (for details
see Gil-Marín et al. 2015c).

Figure 7. Monopole (red) and quadrupole (blue) in Fourier space for the LOWZ (left) and CMASS galaxies (right) for the mean over
2,048 MD PATCHY mocks for both southern and northern galactic caps, the average and 1-$\sigma$ uncertainties are shown. The results for QPM
(1,000 mocks for each LOWZ/CMASS, and north/south) are shown with dashed magenta lines. The error bars assigned to the data
points have been computed based on 2,048 MD PATCHY mocks. The ratio plots in the bottom panels have been only done for the MD
PATCHY mocks.

Figure 8. Monopole (on the left) and quadrupole (on the right) before and after BAO reconstruction (see Vargas-Magaña et al. in
prep.). The error bars represent the BOSS DR12 data. The solid lines correspond to the mean, and the shaded contours represent the
1-$\sigma$ regions, according to the MD PATCHY mocks (red pre-, and blue post-reconstruction).

Discreteness adds a shot noise contribution to the
measured power spectrum and bispectrum. In this paper we
assume that these contributions are of Poisson type and
therefore are given by

$$P_{\rm sn}(k) = \frac{1}{\bar{n}}, \qquad (16)$$

$$B_{\rm sn}(\mathbf{k}_1, \mathbf{k}_2) = \frac{1}{\bar{n}}\left[P(k_1) + P(k_2) + P(k_3)\right] + \frac{1}{\bar{n}^2}, \qquad (17)$$

where $k_3 = |\mathbf{k}_1 + \mathbf{k}_2|$ and $\bar{n}$ is the number density of haloes.

For both power spectrum and bispectrum we present 
the BOSS DR12 data error-bars computed from the disper¬ 
sion among 2,048 and 100 realizations of MD patchy mock 
catalogues, respectively. 


The Fourier space analysis has been used to improve
the modelling of the RSD in the galaxy mock catalogues.
We have assigned higher peculiar random motions to about
10% of the galaxies to fit the quadrupole of the data with a
specific value for each of the 10 redshift bins. The resulting
monopoles and quadrupoles show a good agreement with
the observations over the range relevant to BAOs and RSDs
up to at least $k \sim 0.3\,h\,{\rm Mpc}^{-1}$ for both LOWZ and CMASS
(see Fig. 7). This agreement is further supported after BAO
reconstruction, as can be seen in Fig. 8. Only towards the
very large scales ($k \lesssim 0.02\,h\,{\rm Mpc}^{-1}$) do we find that the
observed monopole tends to be larger than that of the mock
catalogues (both MD PATCHY and QPM). This hints towards the
discrepancy in the monopole found in configuration space (see the
previous section). Although the PATCHY method can
potentially yield accurate two-point statistics up to $k \sim 1\,h\,{\rm Mpc}^{-1}$
(see Kitaura et al. 2014; Chuang et al. 2015b), we have
restricted the study to lower $k$, as the analysis of BAOs
and RSDs will not be done beyond $k = 0.3\,h\,{\rm Mpc}^{-1}$, and
the computation of power spectra for thousands of mocks
with large grids becomes very expensive.

Figure 9. Bispectra and reduced bispectra for LOWZ mocks and observed galaxies for different configurations indicated above each
panel. The red solid line corresponds to the mean and the red shaded region to the 1-$\sigma$ contour of 100 MD PATCHY mocks. The black dots
correspond to the BOSS DR12 data with the error bars taken from the MD PATCHY mocks.

This fitting procedure had, however, as a consequence
that the three-point correlation function is slightly less
precise at angles close to $\theta \sim 0$ and $\theta \sim \pi$, as can be seen in
Fig. 6, which prior to this operation was fully compatible
with the reference catalogue. In fact the reference
BigMultiDark catalogue used in this study showed a highly
discrepant quadrupole, as compared to the observations. This
has been analysed in depth and a better agreement has been
found based on an improved HAM procedure applied to
the BigMultiDark simulation (see Rodríguez-Torres et al.
2015, companion paper), which however was not available
at the time the MD PATCHY mocks were generated.
The HOD model adopted in the QPM mock catalogues
assumed about 10% satellite galaxies. This yields a compatible
quadrupole for the CMASS galaxies. However, as these
catalogues were not iteratively calibrated for different redshift
slices, their agreement with the LOWZ galaxies is less
accurate.

Figure 10. Bispectra and reduced bispectra for CMASS mocks and observed galaxies for different configurations. The red solid line
corresponds to the mean and the red shaded region to the 1-$\sigma$ contour of 100 MD PATCHY mocks. The black dots correspond to the BOSS
DR12 data with the error bars taken from the MD PATCHY mocks.

A detailed analysis of the bispectra is presented in
Figs. 9 and 10, demonstrating a reasonable agreement
between the mocks and the observations for different
configurations of triangles across a wide range of scales, given the
high uncertainties introduced by the mask, selection
function, and cosmic variance.

Figure 11. Monopole and the quadrupole for different redshift bins over the redshift range 0.15 < z < 0.7. The black error bars stand
for the BOSS DR12 data. The shaded contours represent the 1-$\sigma$ regions according to the MD PATCHY mocks in blue and according to
the QPM mocks in red. These measurements are used in the BAO and RSD analysis in Chuang et al. (in prep.).

Figure 12. Monopole showing the evolution for LOWZ. The
corresponding redshift bins for the PATCHY mocks are represented
by shaded regions, and the observations by the error bars.


3.3 Cosmic evolution

The cosmic evolution modelled in the MD PATCHY mocks
was achieved by fitting the clustering of 10 redshift bins for
the full redshift range spanning about 5 Gyr. This implied
running structure formation with ALPT for each redshift,
i.e., modelling the growth of structures and the growth rate,
and additionally fitting the galaxy bias evolution and the
nonlinear RSDs. The evolution of clustering for both sets of
mocks in the full redshift range is shown in Fig. 11. While
the correlation function for CMASS galaxies does not show
strong differences along the CMASS redshift range, this
evolution is very apparent for the LOWZ sample. Fig. 12 shows
the comparison between the mocks and the observations for
different LOWZ redshift bins in more detail. The QPM mocks do not
include a detailed cosmic evolution within LOWZ or CMASS,
being based on mean redshifts for each case. This explains
why these mocks lose accuracy in the two-point statistics
towards low redshifts.

We now investigate the cosmic evolution of the covariance
matrices derived from the MD PATCHY mocks, computed
as in Anderson et al. (2014):

$${\rm cov}[i, j] = \frac{\sum_{l=1}^{N_s} \left(\xi^l_i - \langle \xi_i \rangle\right)\left(\xi^l_j - \langle \xi_j \rangle\right)}{N_s - 1}, \qquad (18)$$

with bins $i$ and $j$, mock sample $l$, and $N_s$ being the number
of simulations.

Covariance matrices for the different catalogues (LOWZ,
CMASS, and combined sample) will be made publicly available
with the publication of the galaxy catalogue.

Figure 13. Cosmic evolution of the correlation matrices for different redshift bins indicated in the legend in bins of 5 $h^{-1}$ Mpc. Lower
left block for the monopole, upper right block for the quadrupole, and upper left and lower right blocks for the correlations between the
monopole and the quadrupole. See §3.3 for details of the calculation. These correlation matrices are used in the BAO and RSD analysis
in Chuang et al. (in prep.).

The correlation matrices for different redshift bins
shown in Fig. 13 were constructed upon the covariance
matrices following

$${\rm corr}[i, j] = \frac{{\rm cov}[i, j]}{\sqrt{{\rm cov}[i, i]\,{\rm cov}[j, j]}}. \qquad (19)$$
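
The sample covariance over mocks and its normalisation to a correlation matrix can be computed with a few NumPy operations. The following is a minimal sketch: the function names and the array layout (one row per mock, one column per separation bin, with monopole and quadrupole bins concatenated if desired) are assumptions of this example.

```python
import numpy as np

def mock_covariance(measurements):
    """Sample covariance over mocks.

    measurements : array of shape (n_mocks, n_bins), one row per mock
    """
    x = np.asarray(measurements, dtype=float)
    n_s = x.shape[0]
    dev = x - x.mean(axis=0)          # deviation from the mean over mocks
    return dev.T @ dev / (n_s - 1)    # unbiased normalisation, N_s - 1

def correlation_matrix(cov):
    """Normalise a covariance matrix so that its diagonal equals one."""
    sigma = np.sqrt(np.diag(cov))
    return cov / np.outer(sigma, sigma)
```

The result of `mock_covariance` agrees with `np.cov(measurements, rowvar=False)`; writing it out makes the $N_s - 1$ normalisation explicit.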


We find that the correlation matrices vary in subsequent
redshift bins. First, the correlation matrices are increasingly
correlated close to the diagonal for both the monopole and
the quadrupole towards lower redshifts, as expected from
gravitational evolution coupling different scales. This is seen
in Fig. 13 as the diagonal red band becomes broader,
especially comparing the highest redshift bin with the lower
ones. Secondly, we find that moderate off-diagonal
correlations present at higher redshifts disappear towards lower
redshifts. Thirdly, we can see that the correlation
between the monopole and the quadrupole at large scales
becomes maximal in the redshift bin 0.43 < z < 0.55, as can
be seen in the white region in the lower right and upper left
blocks. This “triangular” correlation is expected from linear
theory (see Eqs. 7 and 9 in Chuang & Wang 2013).

Figure 14. Angular correlation functions based on the combined sample. Left panel: Angular auto-correlation function on small scales
for two different tomographic bins (see key for redshift ranges), where colour bands are the mean and 1-$\sigma$ region of MD PATCHY and
symbols correspond to the measurements on the DR12 combined sample. Central panel: angular cross-correlation function on small scales
between different tomographic bins, following the same key as the left-hand panel. Right-hand panel: large-scale angular auto-correlation
function for two different redshift bins. These measurements are used in the tomographic analysis of galaxy clustering in Salazar-Albornoz
et al. (in prep.).

Figure 15. Multipole moments based on the combined sample: monopole $P_0$, quadrupole $P_2$ and hexadecapole $P_4$ for different redshift
bins (see key for redshift ranges), where colour bands are the mean of MD PATCHY and symbols correspond to the measurements on the
DR12 combined sample. These measurements are used in the wedges analysis of galaxy clustering in Grieb et al. (in prep.).

Further calculations of the correlation functions including
QPM mocks are shown in companion publications
(Gil-Marín et al. 2015a,b, companion papers).

Additionally we show in Fig. 14 the angular correlation 
function and in Fig. 15 the multipole moments (including 
the hexadecapole) for different redshift bins based on the 
combined sample showing good agreement between the MD 
PATCHY mocks and the data. 


4 FUTURE WORK 

We have taken advantage in this survey of the characteristic
bias of LRGs, being massive objects residing in high
density regions. This work confirms that threshold bias is
an essential ingredient to explain the clustering of LRGs.
This facilitates our analysis, since the low density filamentary
network did not need to be accurately described, and it
has permitted us to rely on low resolution (augmented
Lagrangian) PT based methods. This will no longer apply for
upcoming surveys based on emission line galaxies residing
in the whole cosmic web. One could improve the methodology
presented in this work by substituting the structure
formation model based on PT with a more accurate one
(e.g. COLA). Whether this is necessary, or whether more
efficient alternative approaches are sufficient (e.g. ALPT with
MUSCLE corrections), will be investigated in future works.

Nonlocal bias was only considered in the mass assignment
step, but neglected in the generation of the full galaxy
population. This may become important to model for
emission line galaxies, and needs a deeper analysis.

The approximate “halo exclusion” modelling is mainly 
responsible for the deviation in the clustering of the most 
massive objects, and could be improved by taking their full 
distribution of relative distances, instead of taking a sharp 
minimum separation for each mass bin, as is done here. 

Another aspect which still needs to be improved in the
catalogues is the clustering on sub-Mpc scales. We have
randomly assigned positions of dark matter particles to the
mock galaxies without considering that some of them are
satellites of central galaxies. This implies that these mocks
are not appropriate for fibre-collision analysis. For the time
being we will leave the mock catalogues as they are, since
most of the studies are not affected by this. Nevertheless,
we would like to stress that this aspect can easily be
corrected by assigning to a fraction of the mock galaxies
positions close to the most massive ones in the neighbourhood,
without the need of redoing the catalogues. The QPM
mocks better model fibre collisions, as the HOD adopted in
that work successfully reproduced the fraction of close
satellites and central galaxies (Gil-Marín et al. 2015a, companion
paper).

Also the photometric calibration systematics, presum¬ 
ably responsible for the excess of power in the data towards 
large scales, require further investigation. 

We have considered one fiducial cosmology. It would be,
however, interesting to provide sets of mock catalogues
running over different combinations of cosmological parameters.

Let us finally mention that we have ignored in this study 
super-survey modes, which may be especially relevant for the 
analysis of the power spectrum at very large scales (Takada 
& Hu 2013; Li et al. 2014a,b; Carron & Szapudi 2015). 

We aim at addressing all these issues in future works. 


5 SUMMARY AND CONCLUSIONS 

We have presented 12,288 mock galaxy catalogues for the 
BOSS DR12, including all relevant physical and observa¬ 
tional effects, to enable a robust analysis of BAOs and RSDs. 

The main features of these mock catalogues are as fol¬ 
lows: 

• large number of catalogues: 2,048 for each LOWZ,
CMASS, and combined LOWZ+CMASS and northern and
southern galactic cap,

• accurate structure formation model on scales of a few 
Mpc, 

• accurate galaxy bias model including nonlinear, 
stochastic, threshold bias, and a nonlocal bias dependence 
on the tidal field tensor and the exclusion effect separation 
of massive objects, 

• modelling redshift evolution of galaxy bias, growth of 
structures, growth rate, and nonlinear RSDs, 

• and additional survey features, such as geometry, sector 
completeness, veto masks and radial selection functions. 

The same degree of accuracy is achieved for the BOSS
DR11 MD PATCHY mocks, for which only 6,000 light-cone
mock catalogues were produced (1,000 for each LOWZ,
CMASS, and combined LOWZ+CMASS and northern and
southern galactic cap).

The MD PATCHY mocks have shown a better match to 
the data than the QPM mocks in terms of two- and three- 
point statistics. Investigating the origin for these differences 
can be interesting as the physical models, and in particular 
the galaxy bias, adopted in each method are quite different. 

We note that neglecting the stochastic bias considered 
in the MD patchy mocks, modelling the deviation from Pois¬ 
son shot noise (predominantly over-dispersion), could under¬ 
estimate the clustering uncertainties. 

The mock catalogues have enabled a robust analysis 
of the BOSS data yielding the necessary error estimates 
and the validation of the analysis methods. In particular 
the studies include the following: 

• a full clustering analysis (Sánchez et al. in prep., Grieb
et al. in prep.: see Fig. 15),

• a tomographic analysis of the large-scale angular galaxy
clustering, where full light-cone effects (e.g. growth, bias and
velocity field evolution) are essential (Salazar-Albornoz et
al. in prep.: see Fig. 14),

• a study of the BAO reconstruction (see Vargas-Magaña
et al. in prep., and Fig. 8 showing the performance
on the MD PATCHY mocks),

• and an RSD analysis (Gil-Marín et al. 2015a, companion
paper; Beutler et al. in prep.).

We have demonstrated that the MD PATCHY BOSS
DR12 mock galaxies match, in general within 1-$\sigma$, the
clustering properties of the BOSS LRGs for the monopole,
quadrupole, and hexadecapole of the two-point correlation
function both in configuration and Fourier space. In
particular we achieve a high accuracy in the modelling of the
monopole up to $k \sim 0.3\,h\,{\rm Mpc}^{-1}$. We have furthermore
shown that we also obtain three-point statistics with the
same level of accuracy as $N$-body based catalogues at scales
larger than a few Mpc, which are close to the observations.

The good agreement between the models and the obser¬ 
vations demonstrates the level of accuracy reached in cos¬ 
mology, our understanding of structure formation, galaxy 
bias, and observational systematics. 

All the mock galaxy catalogues and the corresponding 
covariance matrices will be made publicly available together 
with the release of the BOSS DR12 galaxy catalogue. 